Much real-world decision making rests on binary categories of information: agree or disagree,
accept or reject, succeed or fail, and so on. Information of this kind is the output of a
classification method, a subject studied in both statistics (e.g., logistic regression) and
machine learning (e.g., Learning Vector Quantization (LVQ)). The input to a classification
method plays a crucial role in the resulting output. This paper investigates the influence of
the measurement scale of the input data (interval, ratio, and nominal) on the performance of
logistic regression and LVQ in classifying an object. Logistic regression modeling is carried
out in several stages until a model that passes the model suitability test is obtained. LVQ
modeling is tested on several codebook sizes, and the best-performing LVQ model is selected.
The best model of each method is then compared on object classification using the Hit Ratio
indicator. Logistic regression yielded two models that pass the model suitability test, namely
the models with interval-scale and nominal-scale predictor variables, while LVQ modeling
yielded three best models, one per dataset, selected from the different codebook sizes. For
data with interval-scale predictor variables, the two methods perform equally well. Both
perform equally poorly when the predictor variables are on the nominal scale. For data with
ratio-scale predictor variables, the LVQ method achieves moderate performance, whereas
logistic regression does not yield a model that passes the model suitability test. Thus, if
the input dataset has interval- or ratio-scale predictor variables, it is preferable to use
the LVQ method for object classification.

1. INTRODUCTION


The input data for classifying or categorizing an object can be either an image or a set of
attributes. Before being used as input, an image must go through a preprocessing stage called feature
extraction, which comprises two processes: texture feature extraction and shape feature extraction
[1], [2]. Methods for classifying an object based on its features or attributes are studied in both
statistics and machine learning. These methods are applied widely, in the exact sciences, engineering,
and the social sciences. Most importantly, the information resulting from the classification is used
as the basis for decision-making or policy. The attributes of an object are very broad in scope: they
may be tangible or intangible, observable or unobservable. To apply a classification method, however,
the objects must be characterized and arranged in a particular data structure. The data structure most
commonly used as the input argument of a classification method is the input-output pair.

In statistics, data in input-output pair format are treated as a causal relationship in which the set
of observed values of the input attributes determines the observed value of the output attribute [3].
An attribute that takes a single value on each object is called a variable. When an object is observed
on 5 variables, it yields 5 observed values associated with that object. Observed values are measured
on one of four scales: nominal, ordinal, interval, or ratio. In statistics, the measurement scale
strongly influences the choice of the most suitable analytical method. For example, if the output
variable (the variable affected by the input variables) is nominal or ordinal, then the statistical
model for object classification is logistic regression [4], [5]. Machine learning methods, on the
other hand, do not require a causal relationship between the input and output variables and are not
concerned with the measurement scale of the output variable [6], [7].

Hand and Henley [8] have reviewed the methods used in object classification. They concluded that 
the classification methods which are easy to understand (such as regression, nearest neighbour and tree-based 
methods) are much more appealing, both to users and to clients, than are methods which are essentially black 
boxes (such as Artificial Neural Network). They also permit more ready explanations of the sort of reasons 
why the methods have reached their decisions. Meanwhile Dreiseitl and Ohno-Machado [9] sampled 72 
papers comparing both logistic regression and neural network models on medical data sets. They analyzed 
these papers with respect to several criteria, such as the size of data sets, model parameter, selection scheme, 
and performance measure used in reporting model results. They said that where performance was compared 
statistically, there was a 5:2 ratio of cases in which it was not significantly better to use neural networks. 

The performance of the logistic regression model on microarray data has been improved with a Bayesian
approach to gene selection and classification. The method can effectively identify important genes
consistent with known biological findings while the classification accuracy remains high [10]. In
addition, the performance of ANN and logistic regression has been compared in various fields:
Felicisimo, et al. [11] mapped landslide susceptibility, M. Shafiee, et al. [12] forecast stock returns
on the Iran Stock Exchange, and Kamley S, et al. [13] forecast the share market. In these studies, the
ANN method used is backpropagation or the support vector machine. Both methods are widely used because
their topology and learning methods are easy to understand. There is also an ANN method known as
Learning Vector Quantization (LVQ), which is still rarely applied because it has a competitive layer
that works on the principle of the self-organizing map [14], [15], so its structure and learning method
are harder to understand. LVQ is a classification method in which each output unit represents a class;
it can be used for grouping when the number of target groups or classes is pre-determined.

Based on the above, this paper examines the implementation of logistic regression and the LVQ network
for object classification on three datasets whose input variables (predictors) have different
measurement scales: interval, ratio, and nominal for data 1, data 2, and data 3, respectively.
Logistic regression modeling proceeds through parameter estimation, testing of the model parameters,
and a goodness-of-fit test, until a suitable model is obtained. In LVQ network modeling, four codebook
sizes are tested: 2, 10, 30, and 50. The best models produced by logistic regression and the LVQ
network are evaluated for classification using the Hit Ratio [16], i.e., the proportion of sample
observations that the classification model classifies correctly. Both methods are implemented in R.


2. LITERATURE REVIEW 
2.1. Binary Logistic Regression Analysis 

Binary logistic regression is logistic regression with a binary (dichotomous) categorical response
variable. The response variable Y follows the Bernoulli distribution with the following probability
function [3]:


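$$f(y_i) = \pi(x_i)^{y_i}\,\bigl(1 - \pi(x_i)\bigr)^{1 - y_i}, \quad y_i = 0, 1 \tag{1}$$


The logistic regression model:

$$\pi(x_i) = \frac{\exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)}{1 + \exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)}, \quad j = 1, 2, \dots, p \tag{2}$$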


Where: $n$ = number of observations
$p$ = number of predictor variables
$\beta_0$ = intercept
$\beta_j$ = logistic regression coefficient of the $j$-th predictor variable
$x_{ji}$ = value of the $j$-th predictor variable on the $i$-th observation.
The logit form is:


$$g(x_i) = \ln\!\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ji} \tag{3}$$


In logistic regression, the response variable can be written as $Y = \pi(x) + \varepsilon$, where
the error $\varepsilon$ takes one of two values:
$Y = 1$: $\varepsilon = 1 - \pi(x)$ with probability $\pi(x)$
$Y = 0$: $\varepsilon = -\pi(x)$ with probability $1 - \pi(x)$
The error therefore has mean zero and variance $\pi(x)\,(1 - \pi(x))$, and follows the
Binomial distribution.
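
As an illustration of the logistic model and its logit form, the following Python sketch evaluates the linear predictor $g(x)$ and the corresponding probability $\pi(x)$. It is not taken from the study, which used R; the function names and the example coefficients are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math


def linear_predictor(x, beta0, betas):
    """Logit g(x) = beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * xj for b, xj in zip(betas, x))


def logistic_probability(x, beta0, betas):
    """pi(x) = exp(g) / (1 + exp(g)), evaluated in a numerically stable way."""
    g = linear_predictor(x, beta0, betas)
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)


# Hypothetical coefficients and observation, for illustration only:
# g = -1.0 + 0.5 * 1.0 + 0.25 * 2.0 = 0, so the probability is 0.5.
p = logistic_probability([1.0, 2.0], beta0=-1.0, betas=[0.5, 0.25])
print(p)
```

Taking $\ln(\pi / (1 - \pi))$ of the returned probability recovers the linear predictor, which is the relationship stated by the logit form.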


2.1.1. Estimation of Model Parameters 

Maximum Likelihood Estimation (MLE) is used to estimate the parameters of the logistic regression
model. The parameter vector is $\beta^T = (\beta_0, \beta_1, \beta_2, \dots, \beta_p)$, and its
estimate is obtained by maximizing the likelihood function $L(\beta)$, i.e., by differentiating it
with respect to the parameters. The likelihood function is the joint probability function of the
observed pairs $(x_i, y_i)$. The probability function for each pair $(x_i, y_i)$ is


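$$f(x_i) = \pi(x_i)^{y_i}\,\bigl(1 - \pi(x_i)\bigr)^{1 - y_i}, \quad i = 1, 2, \dots, n \tag{4}$$


where $\pi(x_i) = \dfrac{\exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)}{1 + \exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)}$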


According to Hosmer and Lemeshow [4], if the observations are assumed to be independent, the
likelihood function is the product of the probability functions in Equation (4).


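$$\begin{aligned}
L(\beta) &= \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \pi(x_i)^{y_i}\,\bigl(1 - \pi(x_i)\bigr)^{1 - y_i} \\
\ell(\beta) &= \ln\bigl[L(\beta)\bigr] \\
&= \sum_{i=1}^{n} \left[ y_i \ln\!\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) + \ln\bigl(1 - \pi(x_i)\bigr) \right] \\
&= \sum_{i=1}^{n} \left[ y_i \Bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\Bigr) + \ln\Bigl(1 + \exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)\Bigr)^{-1} \right] \\
&= \sum_{i=1}^{n} y_i \Bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\Bigr) - \sum_{i=1}^{n} \ln\Bigl(1 + \exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)\Bigr)
\end{aligned}$$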


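The maximum of the log-likelihood is obtained by differentiating $\ell(\beta)$ with respect to $\beta$ and setting the result equal to zero.


$$\begin{aligned}
\frac{\partial \ell(\beta)}{\partial \beta_j} &= \sum_{i=1}^{n} y_i x_{ji} - \sum_{i=1}^{n} x_{ji}\, \frac{\exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)}{1 + \exp\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ji}\bigr)} \\
0 &= \sum_{i=1}^{n} y_i x_{ji} - \sum_{i=1}^{n} x_{ji}\, \pi(x_i) \\
0 &= \sum_{i=1}^{n} x_{ji}\, \bigl(y_i - \pi(x_i)\bigr)
\end{aligned} \tag{5}$$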


Since Equation (5) is non-linear in $\beta$, it cannot be solved analytically and requires an
iterative method such as the Newton-Raphson method [5].
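
The iterative solution of Equation (5) can be sketched as follows. This is a minimal pure-Python illustration of Newton-Raphson for logistic regression, not the R routine used in the study; the function names, the toy data, and the stopping rule are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function, evaluated in a numerically stable way."""
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_logistic(X, y, max_iter=25, tol=1e-10):
    """Newton-Raphson for the score equations; rows of X exclude the intercept."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 carries beta_0
    d = len(rows[0])
    beta = [0.0] * d
    for _ in range(max_iter):
        pi = [sigmoid(sum(b * v for b, v in zip(beta, r))) for r in rows]
        # Score vector: sum_i x_ji * (y_i - pi_i), the right-hand side of Equation (5)
        score = [sum(r[j] * (yi - p) for r, yi, p in zip(rows, y, pi)) for j in range(d)]
        # Information matrix: sum_i pi_i * (1 - pi_i) * x_i x_i^T
        info = [[sum(p * (1 - p) * r[j] * r[k] for r, p in zip(rows, pi))
                 for k in range(d)] for j in range(d)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Hypothetical toy data (one predictor, classes overlap so the MLE exists).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]
beta = fit_logistic(X, y)
print(beta)
```

At convergence the score vector is numerically zero, which is exactly the condition in Equation (5). Note that the iteration does not converge when the classes are perfectly separable, because the likelihood then has no finite maximum.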

2.1.2. Parameter Significance Testing
a. Simultaneous Testing 

The simultaneous test examines the joint role of all predictor variables in the model. The
statistical hypotheses and test statistic are as follows [3]:


Hypothesis: $H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$ versus $H_1$: at least one $\beta_j \neq 0$;

$$G = -2 \ln\!\left(\frac{L_0}{L_1}\right) = -2\,\bigl[\ln L_0 - \ln L_1\bigr] \tag{6}$$


where: $L_0$: likelihood of the model without the coefficients $\beta_j$ (intercept only)
$L_1$: likelihood of the model with $\beta_j$, $j = 1, 2, \dots, p$


The G value is compared with the critical value $\chi^2_{(df,\,0.05)}$, with degrees of freedom
corresponding to the number of predictor parameters tested. $H_0$ is not rejected if the p-value is
greater than $\alpha$, the probability of a type I error.

b. Partial Testing 

Partial testing examines the effect of each parameter $\beta_j$ on the model individually. The
result shows whether a predictor variable is eligible to enter the model: if the hypothesis test
finds $\beta_j$ significant, the corresponding predictor enters the model. Hypothesis:
$H_0: \beta_j = 0$ versus $H_1: \beta_j \neq 0$.

The Wald test statistic is

$$W_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$

$H_0$ is rejected if $|W_j| > Z_{\alpha/2}$ or the p-value is less than 0.05, in which case it can
be concluded that the predictor variable influences the response variable [4]. In addition to
following the normal distribution, the squared Wald statistic $W_j^2$ follows the chi-square
distribution with one degree of freedom [3].
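
A small Python sketch of the partial (Wald) test is given below. It assumes a two-sided test against the standard normal distribution, as described above; the function name and the example estimate and standard error are hypothetical, not values from the study.

```python
import math


def wald_test(beta_hat, se, alpha=0.05):
    """Return (W, two-sided p-value, reject H0?) for H0: beta_j = 0."""
    w = beta_hat / se
    # P(|Z| > |w|) for a standard normal Z equals erfc(|w| / sqrt(2)).
    p_value = math.erfc(abs(w) / math.sqrt(2.0))
    return w, p_value, p_value < alpha


# Hypothetical estimate and standard error, for illustration only.
w, p_value, reject = wald_test(beta_hat=1.0, se=0.5)
print(w, p_value, reject)
```

Because $W_j^2$ is chi-square distributed with one degree of freedom, the same p-value is obtained from the chi-square form of the test.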


2.1.3. Goodness of Fit Test 

The fit of a logistic regression model can be assessed with a goodness-of-fit test. This test
evaluates how well the model explains the response variable, i.e., whether the model represents the
actual condition described by the data used in the analysis. The hypothesis and test statistic are
as follows [3]:

Hypothesis: $H_0$: The model is appropriate vs $H_1$: The model is not appropriate.

The test statistic used is the Pearson chi-square, $\chi^2 = \sum_{i=1}^{n} e_i^2$. $H_0$ is rejected if $\chi^2 > \chi^2_{(\alpha,\, n-p-1)}$.


2.2. Artificial Neural Network with Competitive Layer 

Artificial neural networks (ANN) with competitive layers have three layers: the input layer, the
hidden layer, and the output layer. Here the competitive layer is the hidden layer. Neurons in a
competitive layer compete for the right to be active. One network model with a competitive layer is
LVQ. There are two learning methods in ANN: supervised learning and unsupervised learning. In
supervised learning, the output for every pattern given as input to the ANN is already known. The
difference between the ANN output and the desired output (target) is called the error. This error is
used to correct the ANN weights so that the ANN output is as close as possible to the known target
pattern. ANN learning algorithms that use this method include Hebbian, Perceptron, Adaline,
Boltzmann, Hopfield, Backpropagation, and LVQ.

One aim of ANN modeling is classification. According to Fausett [6], classification with ANN is
based on the optimal weights obtained from the learning process. The weights are the values on the
connections between neurons that transfer data from one layer to another; they regulate the network
so that it produces the desired output. ANN has the advantage that it need not satisfy the classical
assumptions or homogeneity of error variance, but it has the weakness of requiring a longer training
time to build the model [13].

LVQ is a classification method in which each output unit represents a specified target class. LVQ
uses a supervised, competitive-learning version of the Kohonen Self-Organizing Map (SOM) algorithm.
The LVQ network architecture according to Kaski and Kohonen [15] is shown in Figure 1.

Figure 1. The architecture of the LVQ network


where: $R$ = number of elements in the input vector
$S^1$ = number of competitive neurons
$S^2$ = number of linear neurons


Based on Figure 1, the LVQ network consists of three layers: an input layer, a competitive layer,
and an output layer. Learning in the competitive layer aims to assign each input vector to a subclass
in the hidden layer, while learning in the output layer aims to map the subclasses of the competitive
layer onto the established target classes. In other words, the competitive layer learns subclasses
and the output layer learns target classes. According to Putri (2012), two factors influence the
learning process in LVQ: the weight initialization and the training rate.

Data used as input to the LVQ network must be in input-output pair format. The output data serve as
the target in the learning process. Suppose the format is as follows:


$$\{s^{(q)} : t^{(q)}\}, \quad q = 1, 2, \dots, Q \tag{7}$$


Where: $s^{(q)}$ = input vector/matrix
$t^{(q)}$ = the output vector


LVQ consists of a competitive layer, which includes a competitive subnet, and a linear output layer.
In the competitive layer, each neuron is assigned to a class; different neurons in the competitive
layer may be assigned to the same class. Each class is then paired with one neuron in the output
layer. Thus the number of neurons in the competitive layer is at least as large as the number of
neurons in the linear output layer [14]. The relationship between the input vector and each weight
vector is measured by the Euclidean distance, and the subnet is used to find the smallest of these
distances.


$$n^{(1)} = \begin{bmatrix} \lVert x - w_1^{(1)} \rVert \\ \lVert x - w_2^{(1)} \rVert \\ \vdots \\ \lVert x - w_{S^1}^{(1)} \rVert \end{bmatrix} \tag{8}$$


An element with value 1 indicates that the input vector belongs to the corresponding class, and an
element with value 0 indicates that it does not. This can be represented by the subnet as a vector
with the following vector function:


$$a^{(1)} = \mathrm{compet}\bigl(n^{(1)}\bigr) \tag{9}$$


The linear output layer of the LVQ network is used to combine subclasses into a single class. This
is done using the weight matrix $W^{(2)}$, i.e., the weight matrix having elements:


$$w_{ij}^{(2)} = \begin{cases} 1, & \text{if neuron } i \text{ is included in subclass } j \\ 0, & \text{if neuron } i \text{ is excluded from subclass } j \end{cases} \tag{10}$$

In addition, the weight matrix $W^{(1)}$ of the competitive layer must be trained using the Kohonen
SOM rule, as follows.

At each iteration, a training vector is presented to the network as input $x$, and the Euclidean
distance from the input vector to each prototype vector (a column of the weight matrix) is
calculated. Neuron $j^*$ wins the competition if the Euclidean distance between $x$ and the
$j^*$-th prototype vector is the smallest. The activation value $a^{(1)}$ is multiplied by
$W^{(2)}$ to obtain the net input $n^{(2)}$. The output is $a^{(2)} = n^{(2)}$, as long as the
transfer function of the output neurons is the identity function. The vector $a^{(2)}$ has only one
nonzero element, $k^*$, indicating that the input vector is assigned to class $k^*$. The Kohonen
rule is used to update the weights of the hidden layer. If $x$ is classified correctly, the weight
vector $w_{j^*}^{(1)}$ of the winning hidden neuron is moved closer to $x$.


$$\Delta w_{j^*}^{(1)} = \alpha\,\bigl(x - w_{j^*}^{(1)}\bigr) \quad \text{if } a_{k^*}^{(2)} = t_{k^*} = 1 \tag{11}$$

But if $x$ is classified incorrectly, the wrong hidden neuron has won the competition. In this
case, the weight vector is moved away from $x$.


$$\Delta w_{j^*}^{(1)} = -\alpha\,\bigl(x - w_{j^*}^{(1)}\bigr) \quad \text{if } a_{k^*}^{(2)} = 1 \neq t_{k^*} = 0 \tag{12}$$

After training, the final weights ($w$) are used for subsequent simulation, testing, or
classification [15].
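
The update rules in Equations (11) and (12) can be sketched as a basic LVQ1 training loop. The Python code below is a simplified illustration under stated assumptions: a constant learning rate, a fixed number of epochs, and hypothetical toy data and initial prototypes. It is not the `class` package routine used in the study.

```python
import random


def nearest(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, prototypes[j])))


def train_lvq1(X, y, prototypes, prototype_classes, alpha=0.1, epochs=20, seed=0):
    """Move the winning prototype toward x if its class matches, away otherwise."""
    rng = random.Random(seed)
    W = [list(w) for w in prototypes]  # copy so the initial weights are not modified
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            j = nearest(X[i], W)
            # +1 implements Equation (11), -1 implements Equation (12).
            sign = 1.0 if prototype_classes[j] == y[i] else -1.0
            W[j] = [w + sign * alpha * (xi - w) for w, xi in zip(W[j], X[i])]
    return W


# Hypothetical toy data: two well-separated clusters with one prototype each.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
classes = [0, 1]
W = train_lvq1(X, y, [[1.0, 1.0], [5.0, 5.0]], classes)
predicted = [classes[nearest(x, W)] for x in X]
print(W, predicted)
```

In practice the learning rate is often decreased over the iterations; a constant rate is used here only to keep the sketch short.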


2.3. Accuracy of Classification 

According to Dianiati (2013), prior to classification, the data are divided into two datasets. The
first is the training dataset, used to form the optimal artificial neural network model; the second
is the testing dataset, used to test the optimal model obtained from the training dataset. Hair, et
al. [5] explain that the most popular training-testing split is 50-50, but many researchers also use
a 60-40 or 75-25 split, since there is no standard rule for dividing the dataset. The precision of
classification can be determined by calculating the Hit Ratio, i.e., the proportion of observational
samples that are classified correctly by the classification function [16]. The Hit Ratio is
calculated with the following formula:


$$\text{Hit Ratio} = \frac{\text{number of objects classified accurately}}{\text{total sample}} \times 100\% \tag{13}$$


The Hit Ratio calculated according to Equation (13) shows the performance of the classification
function. A larger Hit Ratio indicates a better classification method.
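
Equation (13) translates directly into code. The following Python sketch is illustrative; the function name and the example labels are hypothetical.

```python
def hit_ratio(actual, predicted):
    """Percentage of observations whose predicted class equals the actual class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Hypothetical labels: 3 of 4 observations are classified correctly.
print(hit_ratio([1, 0, 1, 1], [1, 0, 0, 1]))
```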


3. RESEARCH METHOD 
3.1. Data Characteristics 

The data used are secondary data obtained from three previous studies. The three datasets have 
different response and predictor variables. The explanation of the data used is shown in Table 1. 


Table 1. Predictor and Response Variables of the Data


Dataset                        Data source    Predictor variables            Predictor  Categorical response      Records
                                                                             scale      variable
1. Diet on toddlers            Sartika [17]   X1=Carbohydrates               Interval   0=Bad diet                108
                                              X2=Vegetables                             1=Diet is enough
                                              X3=Side dish
                                              X4=Fruit
                                              X5=Milk
2. The granting of credit      Sunadji [18]   X1=Experience                  Ratio      0=Do not accept credit    138
   for seaweed business                       X2=Duration of education                  1=Accepting credit
                                              X3=Employment intensity
                                              X4=Age
                                              X5=Seaweed cleanliness level
                                              X6=Seaweed water content
3. Factors that influence      Pandin [19]    X1=Mother age                  Nominal    1=Case of LBW             96
   the incidence of low                       X2=Parity                                 2=Not a case of LBW
   birth weight infants                       X3=Birth distance
   (LBW)                                      X4=Anemia
                                              X5=Nutrition status
                                              X6=Education


3.2. Research Stages
In this study, the first step is to divide each dataset into two parts: 70% for training and 30% for
testing. Logistic regression analysis and LVQ analysis are then carried out. Binary logistic
regression analysis was performed using R. The steps in the binary logistic regression analysis were:
a. Check for multicollinearity among the independent variables X
b. Estimate the parameters
c. Test the parameters simultaneously
d. Test the parameters partially
e. Establish the logistic regression model
f. Check the suitability of the model
g. Apply the binary logistic regression model obtained from the training data to the testing data
h. Form a classification accuracy table for the training and testing data.

The LVQ analysis was done using the class package in R. The steps in the LVQ analysis are:
a. Form an input matrix from the training data and a vector (factor) of classes for the training data
b. Initialize the weights of the LVQ network
c. Determine the learning rate ($\alpha$)
d. Update the weights of the competitive layer to obtain the optimum weights
e. Form an optimal architecture
f. Classify the testing dataset using the best LVQ architecture formed from the training dataset.
After the LVQ analysis is complete, the classification accuracy indicator, the Hit Ratio, is
calculated and then compared between the two methods.
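
The first stage, the 70/30 division of each dataset, can be sketched in Python as follows. The random shuffling, the seed, the rounding rule for the training size, and the function name are assumptions; the paper does not state how the split was drawn.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=42):
    """Shuffle the records and split them into training and testing parts."""
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_train = round(train_fraction * len(records))  # assumed rounding rule
    train = [records[i] for i in indices[:n_train]]
    test = [records[i] for i in indices[n_train:]]
    return train, test


# Record counts of the three datasets as listed in Table 1.
for n_records in (108, 138, 96):
    train, test = split_train_test(list(range(n_records)))
    print(n_records, len(train), len(test))
```

Fixing the seed makes the split reproducible, so that both methods are trained and tested on the same observations.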


4. RESULTS AND ANALYSIS 
4.1. Logistic Regression Modeling 

The data analysis begins with multicollinearity testing among the predictor variables of each
dataset. None of the three datasets shows multicollinearity, since no VIF value is greater than 10.
The logistic regression modeling of the three datasets can therefore continue with estimating the
model parameters of each dataset. The parameter estimates obtained by the maximum likelihood method
are presented in Table 2.


Table 2. The Parameter Estimation Results on All Datasets

Predictors   Parameter estimates
             Dataset 1    Dataset 2    Dataset 3
X1           5.760        0.234        0.619
X2           2.184        0.002        -0.120
X3           2.279        0.035        -0.191
X4           4.415        -0.089       -1.382
X5           4.955        0.370        1.092
X6           -            -2.976       2.332
Constant     -100.753     45.289       2.153


The full models formed for dataset 1, dataset 2, and dataset 3, respectively, are


$g(x) = -100.753 + 5.76 X_1 + 2.184 X_2 + 2.279 X_3 + 4.415 X_4 + 4.955 X_5$
$g(x) = 45.289 + 0.234 X_1 + 0.002 X_2 + 0.035 X_3 - 0.089 X_4 + 0.370 X_5 - 2.976 X_6$
$g(x) = 2.153 - 0.619 X_{11} - 0.120 X_{21} - 0.191 X_{31} - 1.382 X_{41} + 1.092 X_{51} - 2.332 X_{61}$


The parameter estimates are tested simultaneously to determine the joint effect of the predictor
variables in the logistic regression model on the response variable. This test is based on the
likelihood ratio test statistic (G). The hypothesis used is:


$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$ versus $H_1$: at least one $\beta_j \neq 0$, $j = 1, 2, \dots, p$


The results of the simultaneous test of the parameter estimates on datasets 1, 2, and 3 are shown in
Table 3.

Table 3. Simultaneous Parameter Testing of All Datasets


Dataset   Log-likelihood                           Likelihood    p-value
          Intercept-only     Model with            ratio test
          model              parameters
1         -43.801            -9.750076             68.103        <0.000
2         -61.508            -7.105                108.805       0.000
3         -46.374            -37.854               17.039        0.009


Based on the simultaneous parameter test in Table 3, for dataset 1 the p-value of the likelihood
ratio test is <0.000. This is less than the significance level $\alpha$ = 0.05, so $H_0$ is
rejected, which means that carbohydrates, vegetables, side dishes, fruit, and milk together
significantly affect the diet status of children under five. For dataset 2, the p-value of the
likelihood ratio test is 0.000, which is less than $\alpha$ = 0.05, so $H_0$ is rejected, which means
that experience, duration of education, employment intensity, age, level of seaweed cleanliness, and
seaweed water content together significantly affect the granting of credit to seaweed farmers. For
dataset 3, the p-value of the likelihood ratio test is 0.009, which is less than $\alpha$ = 0.05, so
$H_0$ is rejected, which means that maternal age, parity, birth distance, anemia, nutritional status,
and education together have a significant effect on the incidence of low birth weight babies.
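
As a cross-check, the likelihood ratio statistics can be recomputed from the two likelihood columns of Table 3 using $G = -2[\ln L_0 - \ln L_1]$. The Python sketch below treats those columns as log-likelihoods and takes the degrees of freedom as the number of predictors per dataset (5, 6, and 6); both are assumptions of the example. The critical values are the standard chi-square values at $\alpha$ = 0.05.

```python
def likelihood_ratio_statistic(loglik_null, loglik_full):
    """G = -2 * (ln L0 - ln L1) for an intercept-only model versus the full model."""
    return -2.0 * (loglik_null - loglik_full)


# (intercept-only value, full-model value, reported G) per dataset, from Table 3.
table3 = {
    1: (-43.801, -9.750076, 68.103),
    2: (-61.508, -7.105, 108.805),
    3: (-46.374, -37.854, 17.039),
}
# Assumed degrees of freedom: the number of predictor parameters in each model.
degrees_of_freedom = {1: 5, 2: 6, 3: 6}
# Standard chi-square critical values at alpha = 0.05.
chi2_critical = {5: 11.070, 6: 12.592}

for dataset, (ll_null, ll_full, reported) in table3.items():
    g = likelihood_ratio_statistic(ll_null, ll_full)
    reject = g > chi2_critical[degrees_of_freedom[dataset]]
    print(dataset, round(g, 3), reported, reject)
```

The recomputed values agree with the reported statistics up to rounding, and all three exceed the critical value, which matches the decision to reject $H_0$ for every dataset.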

To determine which predictor variables significantly influence the response variable, the
significance of the parameter of each predictor variable is tested using the Wald ($W_j$) test
statistic. The statistical hypothesis tested is $H_0: \beta_j = 0$ versus $H_1: \beta_j \neq 0$. The
squared Wald statistic is chi-square distributed with one degree of freedom. Based on the p-values of
the Wald test, for the first dataset the variables $X_2$ (vegetables) and $X_3$ (side dish) did not
significantly affect the classification of toddler diet. In the second dataset, only the predictor
$X_5$ (level of seaweed cleanliness) has a significant effect on the granting of credit to seaweed
farmers. In the third dataset, $X_1$ (maternal age), $X_2$ (parity), $X_3$ (birth distance), and
$X_5$ (nutritional status) did not significantly affect the classification of low birth weight
babies. The logistic regression models formed from the significant predictor variables are shown in
Table 4:


Table 4. The Logistic Regression Model of All Datasets and Goodness of Fit Test


Dataset   The final model of logistic regression               Chi-Square   df   p-value
1         g(x) = -61.479 + 5.138 X1 + 3.815 X4 + 2.893 X5      2.897        8    0.941
2         g(x) = -16.015 + 0.178 X5                            7.305        1    0.007
3         g(x) = -2.429 + 1.442 X41 + 2.264 X61                0.518        2    0.772


The goodness-of-fit test is used to determine whether the resulting model is appropriate (feasible).
The statistical hypothesis used in this test is $H_0: f_o = f_e$ (observed frequency = expected
frequency) versus $H_1: f_o \neq f_e$ (observed frequency ≠ expected frequency). Based on Table 4,
the p-values for the 1st and 3rd datasets are greater than 0.05, so $H_0$ is not rejected and it can
be concluded that the binary logistic regression models are appropriate: the models adequately
explain the 1st data (classification of toddler diet) and the 3rd data (classification of low birth
weight infants). For the 2nd dataset, the p-value is less than 0.05, so the binary logistic
regression model is not suitable, i.e., not adequate to explain the data. Based on these results,
the logistic regression models of the 1st and 3rd datasets can be used for object classification and
should produce fairly good classification accuracy.


4.2. Learning Vector Quantization (LVQ) Optimum Model 

To obtain an optimal LVQ network model, the LVQ network is modeled with various numbers of neurons
in the hidden layer, called the codebook size. The codebook sizes used are 2, 10, 30, and 50. The
codebook size determines the dimension of the weight matrix to be calculated. Given this dimension,
the weights are initialized randomly. The optimal weights are then obtained through the training
process using the training data. Once all connecting weights between nodes in different layers of
the LVQ network have been obtained, the network output is obtained by feeding input data into the
network: training data as input yield the training output, and testing data as input yield the
testing output. The Hit Ratio can be calculated from either output. The initial weights of the four
codebooks for the first dataset are shown in Table 5.

Table 5. The Initial Weights between Input and Hidden Layer of the First Dataset


Codebook   Hidden    w_1^(1)   w_2^(1)   w_3^(1)   w_4^(1)   w_5^(1)
size       neuron
2          1         4.73      3.58      5.01      4.18      5.43
           2         6.70      4.76      4.33      5.00      5.24
10         1         5.34      2.62      4.42      3.36      3.78
           2         4.58      3.71      3.30      3.30      5.00
           ...
           10        6.70      4.40      4.83      3.34      7.50
30         1         5.34      2.62      4.42      3.36      3.78
           2         4.73      3.04      2.80      2.34      3.68
           ...
           30        6.10      4.14      5.49      5.28      5.00
50         1         4.73      3.58      5.01      4.18      5.43
           2         5.60      4.50      2.84      3.30      4.32
           ...
           50        6.18      4.91      5.19      4.00      5.43


As explained in the previous section, data 1 has 5 input variables considered to affect the response
variable. With codebook=2, a 5x2 matrix must be initialized whose elements are the connecting
weights between the input layer and the hidden layer, together with a 2x1 vector whose elements are
the connecting weights from the hidden layer to the output layer. With codebook=50, the weight
matrix and vector to be initialized have dimensions 5x50 and 50x1, respectively. After the training
process, the optimal weights are obtained, some elements of which are presented in Table 6 as
follows:


Table 6. The Final Weights between Input and Hidden Layer of the First Dataset

Codebook   Hidden    w_1^(1)   w_2^(1)   w_3^(1)   w_4^(1)   w_5^(1)
size       neuron
2          2         6.226     4.735     5.110     4.902     6.293
10         1         5.141     3.114     4.345     3.810     3.920
           2         4.912     3.760     3.036     3.047     4.457
           ...
           10        5.939     4.835     4.285     3.892     7.170
30         1         5.340     2.620     4.420     3.360     3.780
           2         4.730     3.040     2.800     2.340     3.680
           ...
           30        5.756     4.469     4.809     5.509     5.255
50         1         4.947     3.848     5.303     4.085     5.130
           2         5.600     4.500     2.840     3.300     4.320
           ...
           50        6.306     5.136     5.327     4.076     5.739


Table 7. Hit Ratio (%) for All Datasets at Various Codebook Sizes


Data   Codebook size   Training data   Testing data
1      2               84.21           84.4
       10              92.1            81.25
       30              93.4            78.12
       50              96.05           78.12
2      2               67.01           75.61
       10              80.41           65.85
       30              83.5            70.73
       50              91.75           73.17
3      2               61.2            37.93
       10              74.62           44.82
       30              76.11           55.17
       50              76.12           44.82


The final LVQ weights obtained from the learning process on the training data are then used as the
network weights. The LVQ network thus has the connecting weights of every node between the layers and
is ready to be used as a model for object classification. Table 7 shows the LVQ network performance
indicator, the Hit Ratio, for the three datasets and the various codebook sizes.

Based on the Hit Ratio values in Table 7, the best LVQ network model for dataset 1 and dataset 2 is
the model with codebook=2, whereas for dataset 3 the best model is codebook=30. These models are
considered the best because they have the highest Hit Ratio on the testing data. Table 7 also shows
that increasing the codebook size increases the Hit Ratio on the training data, but this does not
hold for the Hit Ratio on the testing data. This phenomenon is known as overfitting.
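
The selection rule described above (choose the codebook size with the highest Hit Ratio on the testing data) can be written out explicitly. The Python sketch below uses the values of Table 7; the dictionary layout and function name are illustrative choices.

```python
# Hit Ratio (%) from Table 7: dataset -> codebook size -> (training, testing).
table7 = {
    1: {2: (84.21, 84.4), 10: (92.1, 81.25), 30: (93.4, 78.12), 50: (96.05, 78.12)},
    2: {2: (67.01, 75.61), 10: (80.41, 65.85), 30: (83.5, 70.73), 50: (91.75, 73.17)},
    3: {2: (61.2, 37.93), 10: (74.62, 44.82), 30: (76.11, 55.17), 50: (76.12, 44.82)},
}


def best_codebook(results):
    """Codebook size with the highest Hit Ratio on the testing data."""
    return max(results, key=lambda size: results[size][1])


def training_increases(results):
    """True if the training Hit Ratio grows with every increase in codebook size."""
    training = [results[size][0] for size in sorted(results)]
    return all(a < b for a, b in zip(training, training[1:]))


for dataset, results in table7.items():
    print(dataset, best_codebook(results), training_increases(results))
```

For every dataset the training Hit Ratio rises with the codebook size while the testing Hit Ratio does not, which is the overfitting pattern noted above.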


4.3. The Comparison of Accuracy Classification between Logistic Regression and LVQ 

The performance of the two methods of classifying objects is compared by the value of Hit Ratio 
calculated on both training and testing datasets of all the best models. The results of the classification 
accuracy of all datasets are presented in Table 8 below: 


Table 8. The Accuracy (Hit Ratio, %) of the Best Model from Both Logistic Regression and LVQ


Data   Training                          Testing
       Logistic Regression   LVQ         Logistic Regression   LVQ
1      90.8                  84.21       84.4                  84.4
2      -                     67.01       -                     75.61
3      65.7                  76.11       55.17                 55.17


Based on the classification accuracy in Table 8, for the second dataset (granting of credit for
seaweed farming) the classification accuracy of logistic regression cannot be calculated because the
model obtained from the data does not pass the model fit test. This is consistent with the partial
parameter test, in which only one predictor variable (level of seaweed cleanliness) had a significant
effect on the response variable (the credit approval decision for seaweed farmers). Note that all
predictor variables of the second dataset are ratio-scaled. In contrast, the LVQ model for the second
dataset still yields a Hit Ratio, with moderate values of 67.01% and 75.61% on the training and
testing datasets, respectively.

The first dataset is a case that demonstrates that the logistic regression model can perform as well
as the LVQ model: both reach a Hit Ratio of 84.4% on the testing dataset. A closer look at this
logistic regression model shows that, in the diagnostic examination of the errors, the errors
satisfy all the assumptions of regression modeling: the errors are independent of each other, have
constant variance, and are normally distributed. The assumption hardest to meet in regression
analysis is the normality of the error distribution.

For the third dataset, both models perform equally poorly, with a Hit Ratio of 55.17% on the testing
dataset. Note that all predictor variables in this dataset are measured on the nominal scale. In the
logistic regression model, only two predictor categories (anemia and low maternal education) had a
significant effect on low birth weight (LBW). This suggests that logistic regression and LVQ
modeling with nominal-scale predictor variables require more factors that influence the response
variable. Although the LVQ model has a Hit Ratio of 76.11% on the training dataset, it remains
unable to classify the testing dataset satisfactorily.


5. CONCLUSION 

The measurement scale of the predictor variables strongly influences the modeling and performance of
the classification model, for both the logistic regression model and the LVQ model. With
interval-scale predictor variables, the best models of both methods produce an equally high accuracy,
Hit Ratio=84.4%. With nominal-scale predictor variables, the classification accuracy of the best
models of both methods is similarly low, Hit Ratio=55.17%. With ratio-scale predictor variables,
logistic regression modeling did not produce a suitable model, but the resulting LVQ model has
moderate accuracy for object classification, Hit Ratio=75.61%.